Revision History
Revision of April 25, 2017 


The section on inline assembly code has been revised to indicate the use of the frame pointer instead of the stack 
pointer for accessing local variables and arguments on the stack. 


Revision of April 17, 2017 


This is the first version of this document. 


Introduction 


This document is a User's Guide to the C compiler for the CSR Kalimba 3 24 bit DSP. This document assumes that you 
know C and that you know the Kalimba 3 instruction set. 


For full information about the processor, please see CSR’s Kalimba Architecture 3 DSP User Guide. 
The Archelon Kalimba 3 C compiler generates code for the subset of the architecture that generally provides all of the
functions of a general purpose processor. It does not generate any code that uses conditional prefixes, nor does it
generate code that uses the special functions of the Address Generator. However, the compiler provides excellent
support for inline assembly code, with the ability to reference C variables in inline assembly code.


The C preprocessor 


The preprocessor completely conforms to the ANSI/ISO Standard 9899-1990, so it should do everything which you 
expect it to do. 


For quick reference, the standard directives are 








#define - define a one line macro
#include - include a file
#if - test if an expression is non-zero
#ifdef - test if a symbol is defined
#ifndef - test if a symbol is undefined
#else - start alternate part of a #if
#endif - close out a #if/#ifdef/#ifndef


The preprocessor is automatically invoked on any C file. 


Archelon's C preprocessor supports some additional directives, which are intended to make it more useful as a 
preprocessor for assembler code. The extra directives are 








#macro - start a multi-line macro
#endm - end a multi-line macro
#set - set a preprocessor symbol to a value
#rept - start a repeat block
#endr - end a repeat block
#error - display the arguments on the error output and terminate execution
#warning - display the arguments on the error output and continue


The multi-line macro facility works much like #define, except that it can span multiple lines instead of being expanded 
inside one line. This makes it very useful for writing macros for assembler code. Instead of writing 


#define name( arg1, arg2 ) some text


you can write


#macro name( arg1, arg2 )
macro line 1
macro line 2
#endm


Inside any kind of macro, you can use the # operator, which, when followed by a macro argument name, turns the
actual argument into a string. You can also use the ## operator to do concatenation (also known as "token pasting").


The #set directive has the form 





#set name expression


The value of name will be the number, which results from evaluating the arithmetic expression. This differs from 
#define, which defines its argument name to be a string. You can use #set to do things like generate unique label 
numbers during multi-line macro expansions. 


The repeat block allows you to expand a sequence of lines a number of times. 
For instance 


#rept 2 
line 1 
line 2 
line 3 
#endr 


will expand to 


line 1
line 2
line 3
line 1
line 2
line 3


The preprocessor provides the following predefined symbols:











__LINE__ - current line number
__FILE__ - current source file
__DATE__ - date
__TIME__ - time
__STDC__ - always set to 1
__ARCHELON__ - always set to 1
__kalimba__ - always set to 1


Kalimba 3 C 


The C compiler for the CSR Kalimba 3 generally conforms to the ANSI/ISO Standard 9899-1990. 


Deviations from the C Standard 


Floating point is not supported. 
The Kalimba C compiler deviates from the C standard in that it treats global static variables the same way as global
variables; that is, their scope is not restricted to the current file and they can be accessed by code in other files. However,
local static variables (static variables declared within a function body) are handled as the C standard requires.


Enhancements to C 


The Archelon Kalimba 3 C compiler supports C++ style comments, which begin with // and extend to the end of the
current line. It also allows you to intermix variable declarations with executable statements, instead of requiring that all
variable declarations appear at the beginning of a compound statement. These features are part of the 1999 C standard.


The Archelon Kalimba C compiler also supports a non-standard loop statement, which gives you an easy way to
have your C code use the Kalimba’s hardware loop feature. The basic syntax for the loop statement is


loop ( N ) S


where N is an expression and S is a statement. This construct causes the statement S to be repeated N times. If N is zero,
then S is skipped.
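

When code that uses the loop statement also has to build with an ordinary host compiler (for unit tests, say), a stand-in
macro can model the semantics just described. The sketch below is only an illustration: HOST_LOOP and sum_words are
hypothetical names, not part of the Archelon compiler, and the macro assumes a non-negative count, since the
description above does not say what happens for a negative N.

```c
#include <stddef.h>

/* Hypothetical host-side stand-in for the Kalimba "loop ( N ) S" statement.
 * The body runs N times and is skipped entirely when N is zero.
 * Assumption: N is non-negative. */
#define HOST_LOOP(n) \
    for (long host_loop_count_ = (long)(n); host_loop_count_ > 0; --host_loop_count_)

/* Example use: add up the first `len` elements of an array. */
static long sum_words(const int *data, size_t len)
{
    long total = 0;
    size_t i = 0;

    HOST_LOOP(len) {
        total += data[i++];
    }
    return total;
}
```

On the target you would write the loop statement itself; the macro only mimics the repeat count and does not use the
hardware loop.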


The Archelon Kalimba 3 C compiler supports fractional integral types, using the keyword _frac instead of signed
or unsigned. See below for more details about fractional types.


The Archelon Kalimba 3 C compiler supports non-standard additional syntax for binding register variables to be within 
a specified register set or to be in a specific register or register pair. 


The Archelon Kalimba 3 C compiler allows you to write inline assembly code which contains symbolic references to 
variables declared in C. 


Types 


Since the Kalimba 3 is a 24 bit word-addressable machine, the sizes of the basic types are as follows: 


type       width in bits   format

char       24              two's complement
short      24              two's complement
int        24              two's complement
pointer    24              two's complement
long       48              two's complement
float      48              NOT IMPLEMENTED
double     48              NOT IMPLEMENTED


Signed-ness of char 


The default signed-ness of the char type is implementation dependent. In the Kalimba 3 C compiler, the char type is 
signed by default (i.e. it is signed unless you write unsigned char). 


Bit Fields 


Bit fields are unsigned by default. If you want to make a bit field be signed, then you must include the signed 
keyword in the declaration. Bit locations are assigned from right to left. 


Alignment of Data 


Since there are no type alignment restrictions in the Kalimba, all types are aligned mod 1. 


Adjacency of data 


In the Kalimba C compiler, there is no guarantee that a pair of adjacent global or static declarations, neither of which 
have initial values, will be emitted in the order in which they were written. So writing something like 


int x; 
char buffer[256]; /* buffer and x may be disjoint */ 


will not guarantee that buffer will follow x. If you need buffer to immediately follow x, then you must provide 
initial values for both, as in 


int x = 0; 
char buffer[256] = { 0 }; /* buffer will immediately follow x */ 





which will cause the compiler to emit the assembly file definitions in the order in which they were written. 


Fixed Point Types 


The char, int, and short types may be signed (the default), unsigned, or _frac. The _frac type is a signed
fractional (or fixed point) type. The leftmost bit of a _frac is the sign bit. The remaining bits represent a fraction in
the range from 0 to (nearly) 1. So a _frac type can represent values in the range from -1 (inclusive) to nearly 1. It is
sort of like a poor man's floating point type.


The long types are stored as two words, in big-endian order (the msb word first, then the lsb word). 
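

The word order can be modelled on a host machine. The following is a minimal sketch, assuming the 48-bit value is held
in the low bits of an int64_t; the names long48_words, split48, and join48 are hypothetical helpers for illustration, not
part of the compiler's library.

```c
#include <stdint.h>

/* Two 24-bit words in the order a Kalimba long occupies memory:
 * words[0] is the most significant word, words[1] the least significant. */
typedef struct {
    uint32_t words[2];
} long48_words;

/* Split the low 48 bits of `value` into big-endian 24-bit words. */
static long48_words split48(int64_t value)
{
    uint64_t bits = (uint64_t)value & 0xFFFFFFFFFFFFULL;
    long48_words out;

    out.words[0] = (uint32_t)(bits >> 24);        /* msb word first */
    out.words[1] = (uint32_t)(bits & 0xFFFFFFu);  /* then the lsb word */
    return out;
}

/* Rebuild the signed 48-bit value from its two words. */
static int64_t join48(long48_words in)
{
    uint64_t bits = ((uint64_t)in.words[0] << 24) | in.words[1];

    if (bits & 0x800000000000ULL) {               /* sign bit of the 48-bit value */
        return (int64_t)bits - (int64_t)0x1000000000000LL;
    }
    return (int64_t)bits;
}
```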




You can write _frac constants by using a floating point constant in the range from -1.0 to 1.0, suffixed by `r' or `R'. 
For instance 


0.125r 
0.99R 


For convenience, "1.0r" is taken to mean the largest possible positive _frac constant (0x7fffff on the Kalimba 3).
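

To see which bit pattern a fractional constant corresponds to, the conversion can be modelled on a host machine. This
is a rough sketch rather than the compiler's actual algorithm: frac24_from_double is a hypothetical helper, and it
assumes that the value is scaled by 2^23 and truncated toward zero, which this guide does not specify.

```c
#include <stdint.h>

#define FRAC24_MAX  ((int32_t)0x7FFFFF)    /* the value that "1.0r" denotes */
#define FRAC24_MIN  (-(int32_t)0x800000)   /* -1.0 */

/* Convert a real value in [-1.0, 1.0] to a 24-bit fractional bit pattern,
 * returned sign-extended in an int32_t.
 * Assumption: scale by 2^23 and truncate toward zero. */
static int32_t frac24_from_double(double value)
{
    double scaled = value * 8388608.0;     /* 2^23 */

    if (scaled >= 8388607.0) {
        return FRAC24_MAX;                 /* includes the special case 1.0 */
    }
    if (scaled <= -8388608.0) {
        return FRAC24_MIN;
    }
    return (int32_t)scaled;
}
```

Under these assumptions, 0.125r corresponds to 0x100000 and 0.5r to 0x400000.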


Cast operations between _frac and non-_frac integers simply pass the value without change, so you can write things like


_frac int a;
a = 0x1e;


If you otherwise mix _frac and non-_frac operands in an expression, then the operation will be done as a _frac
operation.


In general, the rules for _frac types are similar to the rules for floating point types. This means that the remainder
operation is not defined. Although shifting is not defined for floating point operands, the compiler will allow you to
shift fractional types, without having to go to the bother of a lot of type casting back and forth, because it is often
convenient to be able to do so.


The long fractional multiply returns the most significant 48 bits of the 96 bit result of a long multiplication, after it has 
been left shifted by one to eliminate the double sign bit. 


The short fractional multiply returns the most significant 24 bits of the 48 bit result of a short multiplication, after it has
been left shifted by one to eliminate the double sign bit.
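

The short fractional multiply can be modelled in portable C as follows. This is a sketch under stated assumptions, not
the compiler's runtime code: wrap24 and frac24_mul are hypothetical names, operands are 24-bit values sign-extended
into int32_t, and the one product that does not fit after the shift (-1.0 times -1.0) simply wraps here, because this
guide does not say how that case is handled.

```c
#include <stdint.h>

/* Keep the low 24 bits of v and sign-extend them. */
static int32_t wrap24(int64_t v)
{
    int64_t low = v & 0xFFFFFF;

    return (int32_t)((low & 0x800000) ? low - 0x1000000 : low);
}

/* Model of the 24-bit fractional multiply: form the 48-bit product, shift it
 * left by one to drop the doubled sign bit, and keep the top 24 bits.
 * Taking the top 24 of 48 bits after a left shift by one is the same as
 * computing floor(product / 2^23). */
static int32_t frac24_mul(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    int64_t quotient = product / 8388608;          /* 2^23, truncates toward zero */

    if (product % 8388608 < 0) {
        quotient -= 1;                             /* adjust truncation to floor */
    }
    return wrap24(quotient);
}
```

For example, 0.5 times 0.5 (0x400000 times 0x400000) yields 0x200000, which is 0.25.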


Fractional divide will saturate the result. If the result would be greater than or equal to 1.0, then fractional division
returns the largest positive fractional number (1.0r). If the result would be less than -1.0, then fractional division will
return -1.0r. Otherwise, the fractional result will be a number in the range from -1.0 to 1.0r (nearly one).
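

The saturation rule for fractional division can be sketched in portable C. The helper frac24_div is hypothetical, and
two points are assumptions rather than documented behaviour: the quotient is truncated toward zero, and the divisor is
never zero (the result of dividing by zero is not described here).

```c
#include <stdint.h>

#define FRAC24_MAX  ((int32_t)0x7FFFFF)    /* 1.0r, the largest positive fraction */
#define FRAC24_MIN  (-(int32_t)0x800000)   /* -1.0r */

/* Model of a saturating 24-bit fractional divide.
 * Assumptions: truncation toward zero; b must not be zero. */
static int32_t frac24_div(int32_t a, int32_t b)
{
    /* Scale the dividend by 2^23 so the quotient is again a fraction. */
    int64_t quotient = ((int64_t)a * 8388608) / b;

    if (quotient > FRAC24_MAX) {
        return FRAC24_MAX;                 /* result would be >= 1.0 */
    }
    if (quotient < FRAC24_MIN) {
        return FRAC24_MIN;                 /* result would be < -1.0 */
    }
    return (int32_t)quotient;
}
```

With this model, 0.25 divided by 0.5 gives 0x400000 (0.5), while 0.5 divided by 0.25 saturates to 0x7fffff.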


Saturation 


For signed addition, subtraction, and multiplication, the CSR Kalimba 3 can saturate the result in case of an overflow 
or underflow. If a saturated operation overflows, the result is the most positive signed integer. If the saturated 
operation underflows, the result is the most negative signed integer. 
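

For host-side testing, the clamping behaviour can be modelled in portable C. The sketch below is not the hardware's
implementation; saturate24, sat24_add, sat24_sub, and sat24_mul are hypothetical names, operands are assumed to be
24-bit values sign-extended into int32_t, and the multiply is assumed to be an integer (not fractional) multiply.

```c
#include <stdint.h>

#define INT24_MAX_VALUE  ((int32_t)0x7FFFFF)    /* most positive 24-bit integer */
#define INT24_MIN_VALUE  (-(int32_t)0x800000)   /* most negative 24-bit integer */

/* Clamp a wide intermediate result to the signed 24-bit range. */
static int32_t saturate24(int64_t v)
{
    if (v > INT24_MAX_VALUE) {
        return INT24_MAX_VALUE;    /* overflow: most positive value */
    }
    if (v < INT24_MIN_VALUE) {
        return INT24_MIN_VALUE;    /* underflow: most negative value */
    }
    return (int32_t)v;
}

static int32_t sat24_add(int32_t a, int32_t b)
{
    return saturate24((int64_t)a + (int64_t)b);
}

static int32_t sat24_sub(int32_t a, int32_t b)
{
    return saturate24((int64_t)a - (int64_t)b);
}

static int32_t sat24_mul(int32_t a, int32_t b)
{
    return saturate24((int64_t)a * (int64_t)b);
}
```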


To do a 24 bit signed saturated multiply, you must use the _satimul function, which is defined in the include file
kalimba3.h. For instance,


#include "kalimba3.h"
int a, b, c;

main ()
{
    a = _satimul(b, c);
}


Note that if you call _satimul without including kalimba3.h, then you will see a linker error complaining that
_satimul is undefined.


Saturation of add and subtract operations is done differently. Instead of using a function, you must set or reset a
processor control register which determines whether or not an add or subtract operation will saturate on overflow or
underflow. To enable saturation, you must set the memory mapped control register ARITHMETIC_MODE to one. The
location of the register depends on the version of the Kalimba 3 that you are using. You must declare the register in
your code using the sfrw keyword. For instance, if your version of the Kalimba 3 is “dale”, “gordon”, or “gemini”, then
the address is 0xfffe93. If it is “rick”, then the address is 0xfffe05.








To disable saturation, you must set it to zero. Initially, saturation is disabled. Once enabled, saturation remains in effect 
until it is disabled. Here is an example, for “rick”: 


sfrw ARITHMETIC_MODE = 0xfffe05;
int a, b, c;

main ()
{
    ARITHMETIC_MODE = 1;
    a = b + c;
    ARITHMETIC_MODE = 0;
}


Placement of Data in Memory


The data memory map of the CSR Kalimba 3 processor includes two distinct data memory areas, and also has the ability to
use flash memory for read-only data. Certain applications also need to position arrays in memory so that they can be
used as circular buffers. All data memories share a common address space.


The Archelon Kalimba 3 C compiler addresses these needs by using memory type keywords and a command line 
option to allow you to direct the compiler as to how you want your data placed in memory. 


The memory type keywords are _DMEM1, _DMEM2, _DMEM3, _DMEM4, and _DMEM5. You can only use these
keywords on global or static variables. If you do not use any of these keywords, then _DMEM1 is the default.


To use any of these keywords, just write one of them at the beginning of a variable declaration. For instance,


int x;          /* uses default _DMEM1 */
_DMEM2 int y;





Variables in the default memory go into the first data memory, called DM1 by the assembler. Variables in _DMEM2 go
into the second data memory, called DM2 by the KAS assembler. Variables in _DMEM3, _DMEM4, or _DMEM5 are
allocated with an alignment suitable for a circular buffer, as follows:


Keyword   Assembler declaration   Memory used
_DMEM3    .VAR/DMCIRC             DM1 or DM2
_DMEM4    .VAR/DM1CIRC            DM1
_DMEM5    .VAR/DM2CIRC            DM2


For instance, to declare a 128 word circular buffer in DM1, you would write


_DMEM4 int my_buffer[128];


The compiler can optionally also put certain kinds of data in flash memory. If you use the MCC command line option 


-Vuse_flash=1 


then the compiler will put all switch statement jump tables, all string constants (unless string constants have been
made writable by using the +wrstr command line option), and any data declared using the C const keyword in flash
memory, using the assembler syntax


.VAR/FLASHDATA


However, this will not apply to variables declared with circular buffer alignment using _DMEM3, _DMEM4, or
_DMEM5, since the assembler does not provide a way to properly align the data for circular buffer use in flash memory.











Note that the -Vuse_flash=1 option is off by default, because the assembler’s default groups.asm file does not contain
an entry for FLASHDATA. Before using this option, you must modify your groups.asm file to define the
FLASHDATA segment.


Placement of Code in memory 
By default, all C code is placed in code RAM. 


To put code into flash memory, I suggest you use inline assembler code outside of function bodies. Although I am not 
at all sure of what the exact syntax should be, it might be something like this: 


void
ram_func (void)
{
}

/$
.ENDMODULE;
.MODULE $M.filename_flash;
.CODESEGMENT PM_FLASH;
.DATASEGMENT DM;
$/

void
flash_func (void)
{
}


Inline functions 


If your function declaration includes the keyword inline, then the compiler will expand the function inline at
each point that it is called.


Even though you may write your code so that every invocation of an inline function can be expanded inline, it may also 
be necessary for the compiler to produce a copy of the function which can be called directly, in case a function in some 
other file calls it or in case the function gets called indirectly. 


If your inline function will only be used in a single file and if you will use it only after you declare it, you should 
use static. If the storage class of the inline function is static, then the compiler will not produce a callable 
expansion of the function unless your code also takes the address of the function. 


If you want to put your inline function in a header file, you should use extern. If the storage class of the inline 
function is extern, then the compiler will not produce a callable expansion of the function at all; if you need one, 
then you should provide a separate declaration without the extern keyword. 


In the absence of either a static or extern keyword, the compiler must always also generate a callable copy of the 
function. If you have an inline function declared extern in a header file and if the function may be called indirectly, 
you must provide, in a separate file, another copy of the same code, but without the extern keyword. 
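

As an illustration of the static case described above, here is a small file-local inline helper. It is written with the
standard C99 spelling of the keyword, on the assumption that the Kalimba compiler accepts the same form; the function
names are made up for the example.

```c
/* File-local inline helper. Because it is static and its address is never
 * taken, no separate callable copy should be needed. */
static inline int clamp_to_range(int value, int low, int high)
{
    if (value < low) {
        return low;
    }
    if (value > high) {
        return high;
    }
    return value;
}

/* The helper is used only after its definition, and only in this file. */
int scale_and_clamp(int sample, int gain)
{
    return clamp_to_range(sample * gain, -1000, 1000);
}
```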


Interrupt functions 


The Kalimba C compiler allows you to declare functions which can be entered from the interrupt vector by using the 
keyword INTERRUPT in the function declaration. For instance, here is a simple example of an interrupt function: 





volatile int got_interrupt; 





INTERRUPT void func( void ) 


{ 
got_interrupt = 1; 
} 


On entering/exiting an interrupt function, the compiler will save/restore the entire accumulator register, plus r0 and r1,
along with any other registers used within the body of the function.

To connect an interrupt vector to an interrupt function, you will need to write assembler code to separately initialize the
interrupt vector with the address of the function, using the function name prefixed by $ (i.e. “$func” in the above
example).
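

A common way to use such a flag is to have the main program test and clear it. The sketch below extends the example
above so that it also builds on a host compiler for testing: the #ifndef guard relies on the predefined symbol
__kalimba__ listed earlier, consume_interrupt is a hypothetical helper, and the flag only records that at least one
interrupt occurred, not how many.

```c
/* On a host compiler INTERRUPT is not a keyword, so define it away.
 * __kalimba__ is the predefined symbol that the Kalimba preprocessor sets to 1. */
#ifndef __kalimba__
#define INTERRUPT
#endif

volatile int got_interrupt;

INTERRUPT void func( void )
{
    got_interrupt = 1;
}

/* Return 1 if an interrupt has been seen since the last call, and clear the flag. */
int consume_interrupt( void )
{
    if (got_interrupt) {
        got_interrupt = 0;
        return 1;
    }
    return 0;
}
```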


#pragma directives


You use #pragma directives to give the compiler further information on how to compile a program. Although every C 
compiler has its own set of #pragma directives, the ability to compile code containing a different compiler’s pragmas 
is supported by the rule that, if a compiler does not recognize a compiler directive, then it ignores the directive. 


In this section, we will discuss those #pragma directives which the Kalimba C compiler supports and which you
are likely to want to use. In general, and unless otherwise noted, you should only use a Kalimba 3 C compiler
pragma between functions.


Global Optimizer pragmas


The global optimizer provides an additional, higher level of optimization compared to the default. However, because of
the aggressive nature of the various optimizations, it is not possible to generate symbolic debug tables for functions
which have been compiled with the global optimizer turned on. Although you can turn on the optimizer for all files, by
setting “+O” in the C command line in the Miscellaneous tab of the Project Options dialog box of the IDE, you will
probably want to use the finer control over optimization which is afforded by these pragmas.


If you want to turn on the global optimizer, then type


#pragma +O


If you want to turn off the global optimizer, then type


#pragma -O


Because you cannot turn the global optimizer on or off inside a function, these pragmas will only take effect starting
at the beginning of the next function. Each remains in effect until the compiler sees a different optimizer pragma or
reaches the end of the current source file.

Inline Assembler pragmas


By default, the compiler assumes that inline assembly code does not contain any call instructions. This assumption is
necessary, since the compiler does not know what is going on inside a block of inline assembly code. By assuming that
there are no calls, the compiler can be more optimistic about the number of registers it needs to save/restore on
entry/exit, making for code that is smaller and faster overall.


If you wish to use a call instruction in a block of inline assembly code, then you should precede the
function with the following pragma:


#pragma inline_asm_uses_call true


After the function body, you should restore the initial default by writing


#pragma inline_asm_uses_call false


For example, you might write something like this:


#pragma inline_asm_uses_call true
void f( void )
{
/$
call
$/
}
#pragma inline_asm_uses_call false


Binding variables to registers 


If you write an assembly code wrapper function, you will need to be able to associate a variable with a particular 
register set or a particular register. Archelon C provides a way for you to do this. 


To begin with, the following table shows the internal register set names used by the compiler, and how each register in 
each set maps to a Kalimba register name. 





Register Set   Register Number   Assembler Name
bank1          0                 r0
               1                 r1
               2                 r2
               3                 r3
               4                 r4
               5                 r5
               6                 r6
               7                 r7
               8                 r8
               9                 r9
               10                r10
                                 Null
                                 rMAC
                                 rLink
                                 rFlags
                                 rMACB


To bind a C variable to a particular register set, you just need to use the register keyword and provide the register set
name, prefixed by an underbar. For instance,


register _AG2 int *p;


This will associate a register in the AG2 set with the pointer variable p. The compiler will determine the actual register
number chosen within the AG2 set.


To bind a C variable to a particular register or register pair, you must use a slightly different syntax:

register int *p @ <register_set> [ <register_number> ];

For instance,

register int *p @ AG1[2];
register int q @ M[0];
register long l @ bank1[2];


For the declaration of the long variable above, the compiler will allocate two registers: r2 and r3.


Inline Assembly Code 


The Kalimba C compiler makes it very easy to drop into assembly code inside a C function. 

To put assembly code inside your C function, simply write /$ and then start writing assembly code. At the end of the 
assembly code, write $/ to revert back to C code. As you might expect, you can put as many lines of assembly code as 
you wish between the /$ and the $/.

When writing inline code, you can access any of the C variables currently in scope by entering 


@name 


where name is the name of the variable. The string @name will be replaced by a value depending on its storage class 
as follows: 


Storage Class    Value

extern           $name
global           $name
local static     file Vn (a generated name)
auto             a positive offset from the frame pointer
argument         a negative offset from the frame pointer
register         the register name


If the variable is declared using the register keyword, and if the type of the variable is large enough to require it to
be placed in two or more registers, then you can optionally select which register you wish to refer to by suffixing
the name with a dot followed by a single digit. For instance, suppose you declare a long variable called x. Since a
long is 48 bits, x will be in two registers. Suppose the compiler puts x into r4 and r5. Then, in inline code, @x or
@x.0 will be replaced by r4 (which will contain the most significant bits), while @x.1 will be replaced by r5
(which will contain the least significant bits).


When referring to variables in in-line assembly code, you must use the name in the correct context. If the storage class 
is register, then you just use @variable. If the storage class is auto (i.e. local to the function and either marked 
auto or else not register and not static), then you must use “M[r9 + @variable]”. If the storage class is 
global or static, then you must use “M[Null + @variable]”. For instance, 


int x;

void test( register int y )
{
    int z;

    /$
    ; load a static or global into r0
    r0 = M[Null + @x];

    ; load a register variable into r0
    r0 = @y;

    ; load a stack variable into r0
    r0 = M[r9 + @z];
    $/
}


If you need to put an actual @ in your code, you must precede the @ character with a \ (backslash) character to prevent
the compiler from thinking you want to refer to a variable.


You can also get at structure offsets by using the construct 


@tag.member_name[.member_name]*


where tag is the "structure tag" as defined in C and member_name is the name of one of the members of that
structure. While the member has struct or union type, you can append an additional member name. The whole
thing is replaced by a constant which is the offset from the start of the original structure to the specified member.


For instance, if you have declared the structure


struct mystruct {
    int a, b;
    struct {
        int x;
        int y;
    } c;
};


then you can obtain the constant offset to member y in in-line assembly code by writing


@mystruct.c.y


You can also obtain the size of a C object in in-line assembly code by writing 


@sizeof( expression ) 
where expression is any C expression which is a legal argument to the C sizeof built-in function. 
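

The C-side counterparts of these constructs are offsetof and sizeof, which can be handy for cross-checking the
constants used in in-line code. The sketch below uses a structure shaped like the one in the example above (the member
layout is an assumption); offset_of_c_y and size_of_inner are hypothetical helper names. On a host compiler the values
are counted in bytes and may include padding, whereas on the Kalimba 3 every int occupies one word. That the @
constructs and the C operators agree numerically on the target is also an assumption.

```c
#include <stddef.h>

/* Illustrative structure with a nested member, as in the example above. */
struct mystruct {
    int a, b;
    struct {
        int x;
        int y;
    } c;
};

/* C-side analogue of @mystruct.c.y: offset of the nested member y. */
static size_t offset_of_c_y(void)
{
    return offsetof(struct mystruct, c.y);
}

/* C-side analogue of @sizeof( ... ) applied to the nested member c. */
static size_t size_of_inner(void)
{
    return sizeof(((struct mystruct *)0)->c);   /* operand is not evaluated */
}
```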


In order to refer to a variable in in-line code, you and the compiler must agree on where the variable is located. For 
variables declared outside of functions, this is not a problem, because the variable is always in memory unless you 
declared it as a global register variable, in which case it is in a register. For local variables, the compiler takes special 
actions to make sure that it puts the variable where you expect to find it. If you did not declare the variable with a 
register keyword, then the compiler will always make sure to put the variable in memory. 


If you did use the register keyword, then the compiler will bind the variable to a “hard” register, if there is an
unallocated register available. (A “hard” register is an actual machine register; in other circumstances, the compiler 
allocates “pseudo” registers, which are bound to “hard” registers by a later register allocation pass.) Otherwise, it will 
issue a fatal error message telling you that it could not bind the variable to a register. Because the compiler uses hard 
registers, each register variable declared in a function and referred to in in-line code reduces the number of registers 
available. If you declare too many register variables, you may leave too few registers, with the result that the register 
allocator will generate an error message complaining that it does not have enough "colorable" registers. 


When the compiler sees inline assembly code, it also makes worst case assumptions about the state of common sub- 
expressions (by marking all current common sub-expressions as if they were invalidated by the inline code) and of the 
state of the registers (by assuming all non-colorable registers are modified by the in-line code). 


It also assumes that your inline code does not contain any call instructions. If it does, then you should use the 
inline_asm_uses_call pragma. 





C wrappers for assembly coded functions 


If you want to write a C callable function in assembly code, it is often most convenient to code it as a C function whose 
body is a block of in-line code. The benefit is that, if you declare all variables to be used in C, then the compiler will
look after saving/restoring registers on entry/exit and will look after assigning register numbers. Such a function would
have a structure like this: 


int 
f( register int x, register int y ) 
{ 
register int a, b, c; 
/$ 
....assembly code using @x, @y, @a, @b, @c
$/ 
return c; 
} 


In this case, only the declarations and return statement are in C; everything else is in assembler. Note that the code uses 
the register keyword in declarations in order to force the compiler to put them into registers, so that they can be 
used as registers in the in-line code. 


Normally, when you declare a local variable in a function, the C compiler feels free to decide whether or not it will be 
kept in a register. However, if a variable is referenced in inline assembly code (/$ ... $/), the C compiler will only put
the variable in a register if you use the register keyword. Otherwise, unless it is declared as static, it will be 
stored in the local stack frame. This allows you and the compiler to agree in advance as to how to access a C variable 
when you access it symbolically in your inline assembly code. 


If, in inline assembly code, you refer to a register directly (by using the name of the register), then you must
save/restore it on entry/exit to/from the inline assembly code. The best way to avoid such concerns is to always declare
your registers as C variables and then refer to them using @name in the inline assembly code. If you want to use a
scratch register (r0-r1) which is not being used to hold an incoming function argument, then you should bind the C
variable to the scratch register directly. For instance, to bind an int variable q to register r5 and to bind the long
variable p to the register pair r6/r7, you would write


register int q @ bank1[5];
register long p @ bank1[6];


Please note also that the compiler’s internal name for the register set which contains the general purpose registers is
"bank1".


In general, if the wrapper function does not contain any function calls, then arguments passed in registers will be used 
as is. Otherwise, they will be copied to other, non-scratch registers. When in doubt, look at the assembly code output 
file generated by the compiler, as it will contain compiler generated comments indicating which register it is using for 
each register variable. 


Debugging 
The front end of the compiler assumes there are an infinite number of registers in the machine, so that it can put any 
local, non-static variable (which has not had its address taken) into a register. Following code generation, the register 
allocator makes the number of registers used by the front end fit into the number of registers available. Where 
necessary, it generates spills and unspills to save and restore a register so it can be used for some other purpose.


In order to generate the best possible code, the register allocator assigns a register to each individual live range of a 
variable. It does not necessarily use the same register for every live range. This means that (a) a local variable may be 
in different registers at different times and that (b) you can only view a local variable at a point where it has an active 
live range. If you want to have a local variable, which you can always easily see in the debugger, then you can do that 
by declaring it as static (as long as the function will not be called recursively or re-entered in some other way). 


If you want to have some idea as to what variables are being assigned to what registers, you can use the +reginfo
command line option. This option annotates the assembly file, generated by the compiler, with additional comments 
showing what variables are in what registers. 


Optimization 
By default, the compiler attempts to optimize your code as much as it can, without losing the ability to allow you to do 
C source level debugging. In doing so, it is able to do many of the commonly useful optimizations, including constant 
folding, common sub-expression elimination, some reduction in strength operations (e.g. replacing certain multiplies by 
shifts), to name just a few. 


The compiler also provides an optional, higher level of optimization, which does a variety of additional code 
improvements, including reduction in strength, constant propagation, loop induction variable elimination, global 
common sub-expression elimination, aliasing analysis, redundant load elimination, loop invariant removal, and dead 
code elimination. 


The principal benefit of using the global optimizer is that it will generally make your code run faster, especially if it 
contains loops, which use array subscripting. 


The principal disadvantages of using the global optimizer are that (1) it may make your program larger (because the 
optimizer may generate loop setup code to make loops run faster) and that (2) it will be harder to debug a globally 
optimized function because the aggressive nature of the optimizations may reorganize your code to such an extent that 
it is hard to see how the original source code has been mapped to assembler code. 


It may not be a good idea to just blindly build or re-build your entire application with the global optimizer. For one
thing, although we are justifiably proud of our global optimizer and although we have done our very best to test it 
thoroughly, it is still a relatively new body of code, which may contain as yet undiscovered bugs, so we recommend a 
certain amount of caution in using it. Also, the optimizer sometimes does not do well, when applied to a function, 
which has been hand-optimized in C (e.g. if you have already rewritten array references in loops to use pointers, instead 
of subscripts). 
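

To make the distinction concrete, here is the same loop written both ways. The functions are illustrative only
(dot_subscript and dot_pointer are made-up names); the first is the plain subscript style, and the second is the kind of
hand-optimized pointer rewrite referred to above. Both compute the same result, so the choice only affects how well the
optimizer can work with the code.

```c
#include <stddef.h>

/* Subscript form: plain array indexing inside the loop. */
static long dot_subscript(const int *x, const int *y, size_t n)
{
    long acc = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        acc += (long)x[i] * y[i];
    }
    return acc;
}

/* Hand-optimized form: array references rewritten to walk pointers. */
static long dot_pointer(const int *x, const int *y, size_t n)
{
    long acc = 0;
    const int *end = x + n;

    while (x < end) {
        acc += (long)*x++ * *y++;
    }
    return acc;
}
```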


You can selectively control which functions are built with the global optimizer turned on and which functions are not, 
by using the global optimizer pragmas, which are described above. Or, you can turn on the global optimizer, on a per 
file basis, by using the “+O” command line option.





C Runtime Environment 


For the purposes of the C compiler, the Kalimba has a set of 10 general purpose registers, called r registers. 


Registers r0-r1 are global scratch registers (caller-saved), which can be used by anyone at anytime. If you call a C 
function from assembler, then you must save any of these registers, which are in use before the call, and then restore 
them afterward. 


Registers r3 to r8 are local registers (callee-saved). If you write an assembler function which is called from C, then
you must save on entry and restore on exit any of these registers which you use in your function.


Register r9 is reserved for use as the C runtime stack pointer. 


Register r10 is reserved for use by the hardware do loop, and is used by the compiler when you write a structure 
assignment or when you write a (non-standard) loop statement in C. This register is treated as callee-saved as well, so 
if you use r10 in assembly code called from C, then you must save r10 on entry and restore it on exit. 


The stack pointer always points to the last used location on the stack, so it can safely be used by interrupt service 
functions. The stack grows from high addresses towards low addresses. There is no check for stack overflow. This 
means that you have to guard against having the stack grow into the space occupied by global variables, by being 
careful about how you write your code. You should be very careful to not use up a lot of stack space by declaring large 
arrays and/or structures on the stack. 
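

One way to keep large objects off the stack is to give them static storage. The sketch below contrasts the two 
approaches; BUF_WORDS, fill_on_stack, and fill_static are names made up for the example, not part of the compiler or 
its library. 

```c
#define BUF_WORDS 512   /* hypothetical buffer size */

/* Risky: each call reserves BUF_WORDS words of stack, and nothing
   checks whether the stack has run into the global variables. */
int fill_on_stack(int seed)
{
    int buf[BUF_WORDS];
    int i;
    for (i = 0; i < BUF_WORDS; i++)
        buf[i] = seed + i;
    return buf[BUF_WORDS - 1];
}

/* Safer for stack usage: the buffer has static storage, so the
   frame only needs room for the loop counter. */
int fill_static(int seed)
{
    static int buf[BUF_WORDS];
    int i;
    for (i = 0; i < BUF_WORDS; i++)
        buf[i] = seed + i;
    return buf[BUF_WORDS - 1];
}
```

The trade-off is that the static version is no longer reentrant, which matters if an interrupt function could call it 
while another call is in progress. 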


Also, the rMAC register is treated as a global scratch register, which can be used by anyone at any time. On the other 
hand, the rMACB register is callee-saved. 


If you are writing assembler code which is to be callable from C code, you must prefix the name of your assembler 
code entry point with a "$", because Kalimba C always prefixes its names with "$". For instance, the C function 
"main" is emitted with the name "$main" in assembler. 


C Program Startup 


C code starts up with the function cstart. The cstart function sets up the stack pointer, and then calls your main 
function. Should main end or return, cstart will then loop forever until the processor is reset. 


Passing Arguments to C functions 


There are two cases to consider: (1) functions with a fixed number of arguments and (2) functions with a variable 
number of arguments (denoted by the presence of ... in the function argument list). 


In the case of a function with a fixed number of arguments, some of the arguments can be passed in registers, while any 
remaining arguments are passed on the stack. The rules for determining what arguments can be passed in registers are 
as follows. There are two registers available for arguments (r0-r1). The compiler goes through the argument list from 
left to right, checking each argument. If the argument's type is suitable for being put into a register and if there are 
enough argument registers left to hold the type, then the argument will be placed in the register selected, and the 
process repeats for the next argument. If the argument's type is not suitable for being put into a register or if there are 
not enough unallocated argument registers left to hold the argument, then the argument in question, and all subsequent 
arguments, are placed on the stack, being stored in order from right to left. 


A type is suitable for being placed in an argument register if it is any basic type (i.e. it is neither a structure nor a 
union). 
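

The allocation rules above can be sketched as a small host-side C function. This is only a model of the rules as 
described, not the compiler's own code. The guide does not say how many registers each type occupies, so the caller 
supplies that number in regs_needed; struct arg_info, assign_args, and ON_STACK are hypothetical names. 

```c
#define NUM_ARG_REGS 2          /* r0 and r1 */
#define ON_STACK     (-1)

/* Hypothetical description of one argument: whether its type is a
   struct/union, and how many registers a value of that type needs
   (an assumption supplied by the caller). */
struct arg_info {
    int is_struct_or_union;
    int regs_needed;
};

/* Fill place[i] with the first register number used by argument i,
   or ON_STACK. Returns how many arguments went into registers. */
int assign_args(const struct arg_info *args, int nargs, int *place)
{
    int next_reg = 0;
    int in_regs;
    int i = 0;

    /* Left to right: keep using registers while each argument fits. */
    while (i < nargs &&
           !args[i].is_struct_or_union &&
           next_reg + args[i].regs_needed <= NUM_ARG_REGS) {
        place[i] = next_reg;
        next_reg += args[i].regs_needed;
        i++;
    }

    /* The first argument that does not fit, and every argument
       after it, goes on the stack. */
    in_regs = i;
    for (; i < nargs; i++)
        place[i] = ON_STACK;
    return in_regs;
}
```

For four arguments that each need one register, the model puts the first two in r0 and r1 and the last two on the stack. 
If the second argument needed two registers, it and everything after it would go on the stack, even though r1 is 
still free. 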


In the case of a function with a variable number of arguments, all arguments are pushed onto the stack from left to 
right, and none are passed in registers. It is very important that any call to a function with a variable number of 
arguments have a correct prototype for the called function in scope, so that the compiler will set up the call correctly. If 
there is no prototype in scope, the compiler will treat the call as having a fixed number of arguments, likely placing 
some arguments in registers instead of putting them on the stack where the called function expects them. The compiler 
is configured to warn you if you call a function which does not have a prototype in scope, by having the +wp command 
line option set by default. 
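

A minimal sketch of a variable-argument function with its prototype in scope is shown below. The function sum_ints is 
invented for the example; it relies only on the Standard C header stdarg.h. 

```c
#include <stdarg.h>

/* Prototype with "..." in scope before any call, so the compiler
   knows to pass every argument on the stack. In a real project
   this line would live in a header included by each caller. */
int sum_ints(int count, ...);

int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    while (count-- > 0)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) made without the prototype in scope would be set up as a fixed-argument call, which 
is the mismatch the warning described above is meant to catch. 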


If a function returns a structured type (a struct or union), then the compiler prepends an additional "target" argument in 
front of the first argument. The target argument contains the address at which to store the structured type being returned 
by the function. If the function does not have a variable number of arguments, then this target argument will use 
up the first of the two registers available for argument passing. 
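

To make the hidden target argument concrete, the sketch below writes the same function twice: once as ordinary C, and 
once in the shape the call roughly takes after the compiler adds the target argument. The names struct pair, make_pair, 
and make_pair_lowered are invented for the illustration; the compiler does not generate a second C function. 

```c
struct pair {
    int first;
    int second;
};

/* What you write in C. */
struct pair make_pair(int a, int b)
{
    struct pair p;
    p.first = a;
    p.second = b;
    return p;
}

/* Roughly how the call is set up: the caller supplies the address
   of the result as a leading "target" argument, and the function
   copies the result there. Assuming an int needs one register, the
   target and a would take the two argument registers and b would
   be passed on the stack. */
void make_pair_lowered(struct pair *target, int a, int b)
{
    struct pair p;
    p.first = a;
    p.second = b;
    *target = p;
}
```

When writing an assembler function that returns a structure to C, this is the argument layout to expect. 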


Return values from C functions 


A function which returns a char, short, or int type will put its return value in the r0 register. 


A function which returns a long or long long type will put its return value in the r0 and r1 registers, with the 
most significant bits in r0 and the least significant bits in r1. 
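

The word order can be modeled on a host machine as follows. This sketch assumes that each register holds one 24 bit 
word, so that the register pair carries 48 bits; the guide does not state the width of long explicitly. The names 
struct reg_pair, split_long, and join_long are hypothetical. 

```c
#include <stdint.h>

#define WORD_BITS 24            /* assumed register width */
#define WORD_MASK 0xFFFFFFUL

/* Host-side model of the r0/r1 register pair; not Kalimba code. */
struct reg_pair {
    uint32_t r0;   /* most significant word  */
    uint32_t r1;   /* least significant word */
};

/* Split a value of up to 48 bits the way the return value is laid out. */
struct reg_pair split_long(uint64_t value)
{
    struct reg_pair p;
    p.r0 = (uint32_t)((value >> WORD_BITS) & WORD_MASK);
    p.r1 = (uint32_t)(value & WORD_MASK);
    return p;
}

/* Rebuild the value from the two words. */
uint64_t join_long(struct reg_pair p)
{
    return ((uint64_t)p.r0 << WORD_BITS) | p.r1;
}
```

An assembler function that returns a long to C would leave the upper word in r0 and the lower word in r1 in the same 
way. 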


A function that returns a structured type will copy the structure being returned to the address passed as the first 
"target" argument. 


Entry to a C function 


Upon entry to a C function, the compiler emits code to 
• Push the current frame pointer onto the stack and copy the current stack pointer to the frame pointer. 
• Allocate space for local variables by adding the amount of space required to the stack pointer. 
• Save all non-scratch registers used in the function by using push or pushm instructions. 


The total amount of the space allocated by the addition to the stack pointer is called the frame size. 


If the function contains a call to a C function or a call to a C runtime library function, then the compiler will also emit 
code to push the rLink register, as well. 


The first thing the compiler does upon entering a function is to create its local stack frame, which contains local 
variables, and compiler temporary variables. It does this by adding a constant to the sp stack pointer. 


Next it uses push or pushm instructions to save any of the registers which the calling C function expects to be 
preserved. These include r2-r8, rMACB, and r10. If the function will call other functions, then there will also be 
code to save the rLink register, which contains the function return address that was set by the call instruction. The 
register save area is at the end (highest addresses) of the stack frame, because it uses space that was allocated after the 
compiler created space for local variables by adding a constant to the stack pointer. 


Any incoming function arguments, which were passed on the stack, are located using negative offsets to the frame 
pointer register. For instance, the first 24 bit argument on the stack will be at fp-1, the second argument on the stack 
will be at fp-2, and so on. 


The compiler does not normally save r0-r1 and assumes that they are free to be used at any time. Of course, it also 
knows that if r0 and r1 contain arguments, it must not use those registers until it has either finished using the 
arguments or moved them elsewhere. However, it will save r0 and r1 if the function is an interrupt 
function. 


If you declare any variables in C, which are bound to any of the address generation unit registers (I0-I7, M0-M3, 
L0-L3) and which you then use in inline assembly code, the C compiler will save these registers on entry. 


Accessing Arguments passed on the stack 


The formula to access incoming arguments on the stack and local variables is 
FP + offset 


where FP is the frame pointer register (fp). For incoming arguments, offset is the negative offset to the argument. 
For instance, if you have the function 


void f( int a, int b, int c, int d ) 
{ 
/* function body code */ 
} 


then, on entry and after the previous frame pointer has been pushed onto the stack, a is in register r0, b is in register 
r1, c is at offset -1, and d is at offset -2. In general, offsets to arguments not in registers are computed by considering 
the arguments in order from left to right. 


Also note that the fp register points to the saved fp register and that local variables, if any, are located starting at 
fp+1. 
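

The frame layout for the function f above can be checked with a toy host-side model. The sketch rests on two 
assumptions that follow from the entry sequence described earlier (the frame pointer is pushed, then space is added to 
the stack pointer): each push moves the stack pointer up by one word, and locals begin at fp+1. The names struct 
machine, push_stack_args, enter_function, and frame_slot are hypothetical. 

```c
#define STACK_WORDS 32

/* Toy model of the stack; not Kalimba code. */
struct machine {
    int mem[STACK_WORDS];
    int sp;   /* index of the last used location */
    int fp;   /* frame pointer */
};

static void push(struct machine *m, int value)
{
    m->mem[++m->sp] = value;
}

/* Caller side of f(a, b, c, d): a and b travel in r0 and r1, so
   only c and d are pushed, in order from right to left. */
void push_stack_args(struct machine *m, int c, int d)
{
    push(m, d);
    push(m, c);
}

/* Callee entry: save the old frame pointer, point fp at the saved
   value, then reserve space for the locals. */
void enter_function(struct machine *m, int local_words)
{
    push(m, m->fp);
    m->fp = m->sp;
    m->sp += local_words;
}

/* Address a slot as fp + offset: negative offsets reach stack
   arguments, positive offsets reach locals. */
int *frame_slot(struct machine *m, int offset)
{
    return &m->mem[m->fp + offset];
}
```

After push_stack_args and enter_function have run, frame_slot(&m, -1) refers to c and frame_slot(&m, -2) refers to d, 
matching the offsets given above. 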


Exit from a C function 


On exit from a C function, the compiler emits code to undo what it did on entry. First, it uses pop or popm instructions 
to restore the registers it saved in reverse order, then it subtracts from the stack pointer the amount of space it reserved 
for local variables (if any). 


If you declare any variables in C, which are bound to any of the address generation unit registers (I0-I7, M0-M3, 
L0-L3) and which you then use in inline assembly code, the C compiler will also use pop instructions to restore these 
registers on exit. 


If the function contains a call to a C function or a call to a C runtime library function, then the compiler will also emit 
code to pop the saved rLink register, as well. 


Finally, the compiler restores the frame pointer by popping it off the stack, then issues an rts instruction to return to 
the caller. 


Function cleanup 


When your function returns, the code that called it must clean up the stack, if it had pushed any arguments on the stack 
for the call. To do so, it subtracts from the stack pointer the total size of the arguments pushed. This leaves the stack 
pointer in the state that it was just prior to generating code for the call. 
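

The bookkeeping of entry, exit, and cleanup can be summarized in a small host-side model that tracks only the stack 
pointer. It assumes that every push moves the stack pointer by one word in the same direction as the addition that 
allocates locals, and that all sizes are counted in words. The names struct frame_model, model_entry, model_exit, and 
model_call are hypothetical. 

```c
/* Toy model that tracks only the stack pointer; not Kalimba code. */
struct frame_model {
    int sp;
};

/* Entry: push fp, reserve locals, push the saved registers. */
void model_entry(struct frame_model *m, int local_words, int saved_regs)
{
    m->sp += 1;             /* push fp */
    m->sp += local_words;   /* add the frame size */
    m->sp += saved_regs;    /* push or pushm */
}

/* Exit: the same steps undone in reverse order. */
void model_exit(struct frame_model *m, int local_words, int saved_regs)
{
    m->sp -= saved_regs;    /* pop or popm */
    m->sp -= local_words;   /* subtract the frame size */
    m->sp -= 1;             /* pop fp, then rts */
}

/* A whole call as seen by the caller. Returns sp after cleanup. */
int model_call(int sp_before, int stack_arg_words,
               int local_words, int saved_regs)
{
    struct frame_model m;
    m.sp = sp_before;
    m.sp += stack_arg_words;                    /* push arguments */
    model_entry(&m, local_words, saved_regs);   /* callee entry   */
    model_exit(&m, local_words, saved_regs);    /* callee exit    */
    m.sp -= stack_arg_words;                    /* caller cleanup */
    return m.sp;
}
```

Whatever the sizes involved, model_call returns the stack pointer it started with, which is the invariant an assembler 
function called from C has to preserve. 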


The include file "kalimba3.h" 


The "kalimba3.h" include file defines some variables and functions, which are very useful for programming the CSR 
Kalimba 3. 


All function names, defined in this file, are prefixed by at least one underbar, in order to avoid conflicts with names in 
user C code, as per the recommendation in the C Standard. 


Builtin (aka intrinsic) functions 


int satimul( int, int ); 
- this function does a 24 bit signed multiply of its operands. 


Using the Tools 


In this section, we will briefly describe how to compile, assemble, and link CSR Kalimba C and assembly code, which 
you do by using commands typed into a “command prompt” window. 


To begin with, you will be using, directly or indirectly, the following commands: 


MCPP - the C preprocessor 

MCC - the C compiler 

KAS - the CSR Kalimba 3 Assembler 
KLINK - the Kalimba 3 linker 


In the examples below, “<ctools_dir>” is the path, with drive letter if necessary, to the directory in which you installed 
the Archelon CSR Kalimba C tools, and “<kalasm_dir>” is the path, again including a drive letter if necessary, to the 
directory in which you installed CSR’s KALASM2. 


Before trying to run MCC, you must first set up a few environment variables, which will allow MCC to find, 
load, and run the code generator for the CSR Kalimba processor. To do so, you must set the environment variables 
MCSYS, MCDIR, and MCINCLUDE as follows: 


set MCSYS=kalimba3 
set MCDIR=<ctools_dir>\mcdir 
set MCINCLUDE=<ctools_dir>\include 





The mcdir directory contains the kalimba3.cif file, which contains the code generator. The include directory 
contains the Standard C include files for the Kalimba. 


To use these commands, you must put the directory in which each resides in your shell’s PATH environment variable. 
You can do this by entering 


set path="<ctools_dir>\bin;%path%" 
set path="<kalasm_dir>\bin;%path%" 


All but one of these commands will print a brief summary of its arguments on the standard output when you type the 
command name with no arguments. The exception to this is MCPP. If it is invoked with no arguments, it assumes that it 
will be reading from the standard input and writing to the standard output. To get it to display its arguments, you must 
invoke it with an incorrect argument, as in “MCPP +ha”. MCC outputs more than one screenful of text. To see it one 
screenful at a time, pipe it through more by typing “mcc | more”. 


The maximum command line length allowed for the MCC and MCPP commands is 512 bytes. If that is not 
enough, the MCC and MCPP commands support an “x=filename” command line option, which allows command 
line options to be read from the file “filename”. Normally, you need not worry about invoking MCPP, since MCC 
invokes it automatically by default. 


Compiling 
To compile a C program, so as to create an object file for linking, you would usually type (at least) 
mcc file.c 


When you enter the above command line, the C compiler will pass the file file.c through the MCPP preprocessor, 
compile the resulting output file, and place the generated assembly code into file.asm. 


The MCC C compiler automatically turns on the command line options 

+wp +c 
The +wp option causes the compiler to issue a warning message whenever it sees a call to a function, which does not 
have a function prototype in scope. The +c means to include the original C source code as comments in the assembly 
output. 
If you wish to compile your entire source file with the optional global optimizer turned on, then you can do this by 
using the “+0” command line option. If you need finer control over which functions get compiled using the global 
optimizer, then you should instead use the global optimizer pragmas. 


Assembling 


To process an assembly language file into an object file, you will usually type 
kalasm2 -c -F<kalasm_dir>\groups.asm -F<kalasm_dir>\default.asm file.asm 


where “<kalasm_dir>” is the path to the directory containing KALASM2 and its support files “groups.asm” and 
“default.asm”. The “-c” option tells KALASM2 to generate only an object file, omitting the linking step. 
The above command will create an object code file, called “file.kobj”, and a listing file, called “file.lst”. 


Linking 


To link a collection of object files into an application ready for loading into the Kalimba, you just need to invoke 
KLINK, with all the desired object files as arguments. For more information about this, please see the CSR Kalimba 
documentation. 